Parameterising Jobscripts¶
Now that we know how to set a script directly, we can at least run jobs on the nodes.
However, this method is rather inflexible, so it would be useful to be able to parameterise these scripts in some way.
Templates¶
The simplest way of doing this is to use a template.
Assuming you have a jobscript already, you’re most of the way there.
Let’s take our script from earlier and create a template from it.
To do this, we just need to identify anything that we would want to change, and add a placeholder for a parameter.
Placeholders¶
These placeholders have a specific format. remotemanager will create value entries for you by searching for anything of the form #VALUE#.
For example, for our nodes parameter, we should change the line
#SBATCH --nodes=4
to
#SBATCH --nodes=#NODES#
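To make the search for #VALUE# placeholders concrete, here is a minimal sketch using a regular expression. The pattern and the find_placeholders helper are assumptions made for this illustration; they are not remotemanager's actual parser.

```python
import re

# Hypothetical pattern: '#', a name, optional ':key=value' arguments, then '#'.
PLACEHOLDER = re.compile(r"#(\w+)(?::[^#\n]*)?#")


def find_placeholders(template: str) -> list[str]:
    """Return lower-cased placeholder names in order of first appearance."""
    names = []
    for match in PLACEHOLDER.finditer(template):
        name = match.group(1).lower()
        if name not in names:
            names.append(name)
    return names


# '#SBATCH' is not picked up, because it is not closed by a second '#'.
print(find_placeholders("#SBATCH --nodes=#NODES#"))  # ['nodes']
```

Note that an ordinary directive prefix such as #SBATCH is left alone in this sketch, since only text enclosed by a pair of # characters counts as a placeholder.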
Let’s do the whole script:
[2]:
jobscript_template = """#!/bin/bash
#SBATCH --ntasks-per-node=#TASKS_PER_NODE#
#SBATCH --cpus-per-task=#CPUS_PER_TASK#
#SBATCH --nodes=#NODES#
#SBATCH --queue=#QUEUE#
#SBATCH --account=#ACCOUNT#
#SBATCH --walltime=#TIME:format=time:default=3600#
#SBATCH --exclusive
#MODULES#"""
Now that we have a parameterised jobscript, how do we use it?
This template can be used by a Computer (or Script) class to generate a script based on your inputs.
Let’s go with Computer, since it’s the more commonly used one:
[3]:
from remotemanager import Computer
conn = Computer(template=jobscript_template)
print(conn.script(tasks_per_node=16, cpus_per_task=4, nodes=12, queue="standard", account="test"))
#!/bin/bash
#SBATCH --ntasks-per-node=16
#SBATCH --cpus-per-task=4
#SBATCH --nodes=12
#SBATCH --queue=standard
#SBATCH --account=test
#SBATCH --walltime=01:00:00
#SBATCH --exclusive
Note
Arguments are always lower case, even if you specify them in upper case in the template.
Note
Don’t worry too much about this script function. It will be called for you when you’re using a Dataset; you don’t have to generate the scripts yourself.
Now that we have a script that uses parameters, we should cover the ways in which you can alter those params.
Generalisation¶
It’s important to note that these parameters can be anything you want.
As an example of this, we’re parameterising our module load with a #MODULES# parameter.
This means that we can dynamically change the modules by setting
conn.modules = "module load ..."
We will come back and use this template later in the tutorial. For now, we should cover the various ways you can control how these values behave.
Keyword Arguments¶
If we take a closer look at the walltime line, notice the :format=time:default=3600 string attached to the #TIME# placeholder. These are keyword args.
In this case, we are specifying that the output should be formatted as though it is a time string, with a default of 3600 (seconds).
Tip
Add keyword args as key=value pairs, just like you would in a normal Python call, but separated by a : character.
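As a rough illustration of this syntax, the sketch below splits the body of a placeholder into a name and its keyword args. The parse_placeholder function is hypothetical and ignores quoting and escaping; it only shows the shape of the format.

```python
def parse_placeholder(body: str) -> tuple[str, dict[str, str]]:
    """Split 'NAME:key=value:...' into a lower-cased name and a dict of kwargs."""
    name, *pairs = body.split(":")
    kwargs = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        kwargs[key] = value
    return name.lower(), kwargs


print(parse_placeholder("TIME:format=time:default=3600"))
# ('time', {'format': 'time', 'default': '3600'})
```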
A complete list of kwargs is provided below:
default¶
This allows you to set the default value of a parameter. If no default is specified and no value is given, then the empty_treatment behaviour applies (described below).
[4]:
template = "a = #a:default=10#"
conn = Computer(template=template)
print(conn.script())
a = 10
value¶
You can also set the value directly. This is roughly equivalent to setting default, but has a higher priority. For example, if you set #PARAM:default="foo":value="bar"#, then “bar” will take priority, rendering the default essentially useless.
optional¶
Set to False to enforce that a value is present.
In the following example, we are allowed to give no value for optional, but we will get an error if we try to generate a script without specifying required:
[5]:
template = """
optional = #OPT#
required = #REQ:optional=False#
"""
conn = Computer(template=template)
print(conn.script())
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[5], line 8
1 template = """
2 optional = #OPT#
3 required = #REQ:optional=False#
4 """
6 conn = Computer(template=template)
----> 8 print(conn.script())
File ~/Work/Devel/remotemanager/remotemanager/script/script.py:319, in Script.script(self, empty_treatment, **run_args)
317 # check validity
318 if not self.valid:
--> 319 raise ValueError(f"Missing values for parameters:\n{self.missing}")
320 # generation section
321 self._link_subs() # ensure values are properly linked
ValueError: Missing values for parameters:
['req']
Optional Properties¶
You can see the required values at any moment by accessing the required property of your Computer.
Alternatively, the missing property shows what you still need to specify.
Finally, valid will be True only if there are no missing parameters.
[6]:
print("required:", conn.required)
print("missing:", conn.missing)
print("valid:", conn.valid)
required: ['req']
missing: ['req']
valid: False
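The relationship between optional, missing and valid can be sketched in a few lines of plain Python. The Param class and missing function below are assumptions for illustration only and do not mirror remotemanager's internals.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Param:
    """Hypothetical stand-in for a template parameter."""
    name: str
    optional: bool = True
    value: Optional[Any] = None
    default: Optional[Any] = None

    @property
    def resolved(self) -> Optional[Any]:
        # An explicit value takes priority over the default.
        return self.default if self.value is None else self.value


def missing(params: list[Param]) -> list[str]:
    """Names of non-optional parameters that resolve to nothing."""
    return [p.name for p in params if not p.optional and p.resolved is None]


params = [Param("opt"), Param("req", optional=False)]
print(missing(params))      # ['req']
print(not missing(params))  # False, the equivalent of "valid"

params[1].value = 4
print(missing(params))      # []
```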
requires¶
If a parameter requires another, you can specify that. In the following template, the optional=True on foo is ignored, since param declares that it requires foo.
Added in version 0.13.3: You can now specify multiple requirements with a comma separated list: requires=a,b,c.
[7]:
template = """
param = #PARAM:default={foo}:requires=foo#
foo = #FOO:optional=True#
"""
conn = Computer(template=template)
print(conn.script())
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[7], line 9
1 template = """
2 param = #PARAM:default={foo}:requires=foo#
3
4 foo = #FOO:optional=True#
5 """
7 conn = Computer(template=template)
----> 9 print(conn.script())
File ~/Work/Devel/remotemanager/remotemanager/script/script.py:319, in Script.script(self, empty_treatment, **run_args)
317 # check validity
318 if not self.valid:
--> 319 raise ValueError(f"Missing values for parameters:\n{self.missing}")
320 # generation section
321 self._link_subs() # ensure values are properly linked
ValueError: Missing values for parameters:
['foo']
replaces¶
You can also mark a parameter as replacing another. This is less useful, but if you find a use case, it’s there.
Here, you can specify only param, and foo will not complain, even though it is marked optional=False.
Added in version 0.13.3: You can now specify multiple replacements with a comma separated list: replaces=a,b,c.
[8]:
template = """
param = #PARAM:replaces=foo#
foo = #FOO:optional=False#
"""
conn = Computer(template=template)
print(conn.script(param="param"))
param = param
min/max¶
Setting min or max will raise an exception if the value steps outside these bounds (even if the value is calculated).
Here, b returns 10x the value of a, so calling with a=3 will result in a value of b=30.
As this is above our set limit of 20, we will get an exception. The same applies to min, but in reverse.
[10]:
template = """
a = #a#
b = #b:max=20:default={a*10}#
"""
conn = Computer(template=template)
print(conn.script(a=3))
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[10], line 8
1 template = """
2 a = #a#
3 b = #b:max=20:default={a*10}#
4 """
6 conn = Computer(template=template)
----> 8 print(conn.script(a=3))
File ~/Work/Devel/remotemanager/remotemanager/script/script.py:327, in Script.script(self, empty_treatment, **run_args)
325 value = DELETION_FLAG_LINE
326 else:
--> 327 value = sub.value # get the value in string form
328 if value is None or value == "None":
329 # no value, triage this argument
330 treatment = empty_treatment or sub.empty_treatment
File ~/Work/Devel/remotemanager/remotemanager/connection/computers/substitution.py:172, in Substitution.value(self)
166 @property
167 def value(self) -> any:
168 """
169 Returns:
170 value if present, else default
171 """
--> 172 val = super().value
174 if len(self.dependencies) == 0:
175 return val
File ~/Work/Devel/remotemanager/remotemanager/connection/computers/dynamicvalue.py:356, in DynamicMixin.value(self)
352 pass
354 val = self._format_value(val)
--> 356 self._validate(val)
358 return val
File ~/Work/Devel/remotemanager/remotemanager/connection/computers/dynamicvalue.py:441, in DynamicMixin._validate(self, value)
437 raise ValueError(
438 f"{value}{nameinsert} is less than minimum value {self.min}"
439 )
440 if self.max is not None and value > self.max:
--> 441 raise ValueError(
442 f"{value}{nameinsert} is more than maximum value {self.max}"
443 )
ValueError: 30 for b is more than maximum value 20
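The bounds check itself is simple to picture. The following sketch uses a hypothetical check_bounds helper, with messages modelled on the traceback above; it is not the library's own code.

```python
def check_bounds(name, value, minimum=None, maximum=None):
    """Raise ValueError if value falls outside [minimum, maximum]."""
    if minimum is not None and value < minimum:
        raise ValueError(f"{value} for {name} is less than minimum value {minimum}")
    if maximum is not None and value > maximum:
        raise ValueError(f"{value} for {name} is more than maximum value {maximum}")
    return value


a = 3
b = a * 10  # the calculated value is checked, not just direct input
try:
    check_bounds("b", b, maximum=20)
except ValueError as err:
    print(err)  # 30 for b is more than maximum value 20
```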
format¶
Format allows you to enforce the format of the variable.
The possible values are:
time
float
time¶
This will enforce a HH:MM:SS format for the input. By default it expects integer seconds, but it also accepts a string input or “semantic time”. For example:
"01:00:00" == "1h" == 3600 -> 01:00:00
"24:00:00" == "1d" == 86400 -> 24:00:00
Added in version 0.11.15: Enabled the ability to set “semantic time” with 4h, 1d, etc.
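For intuition, a conversion of this kind could look like the sketch below. The to_hhmmss function is hypothetical and only handles integer seconds, HH:MM:SS strings, and single-unit strings such as 4h or 1d; the formats the library really accepts may be broader.

```python
import re

# Seconds per unit for the simple "semantic time" strings handled here.
UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def to_hhmmss(value) -> str:
    """Format integer seconds or a simple time string as HH:MM:SS."""
    if isinstance(value, str):
        if re.fullmatch(r"\d+:\d{2}:\d{2}", value):
            return value  # already in the target format
        match = re.fullmatch(r"(\d+)([smhd])", value)
        if match is None:
            raise ValueError(f"unrecognised time string: {value!r}")
        seconds = int(match.group(1)) * UNITS[match.group(2)]
    else:
        seconds = int(value)
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


print(to_hhmmss(3600))  # 01:00:00
print(to_hhmmss("1h"))  # 01:00:00
print(to_hhmmss("1d"))  # 24:00:00
```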
float¶
Float will enforce a value to float.
By default, all numerical inputs are passed through math.ceil and converted to int.
This prevents two issues:
Integers are generally preferred in jobscripts; requesting resources with floats may lead to unintended behaviour.
Small fractions will not resolve to 0; rounding up requests at least 1 of the resource in question.
Let’s say we have a calculation that requests nodes based on the total number of tasks and the number of cores that each node has. In this situation we could accidentally request 0 nodes if we ask for less than a full node:
TASKS/CORES_AVAILABLE = NODES
resolves to 64/128=0.5
So instead of requesting nodes=0.5 (or nodes=0), we convert this to nodes=1.
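The rounding described here can be checked directly with math.ceil from the standard library. The nodes_needed helper is just a name chosen for this example.

```python
import math


def nodes_needed(tasks: int, cores_per_node: int) -> int:
    """Round the node count up so a partial node still requests a whole one."""
    return math.ceil(tasks / cores_per_node)


print(nodes_needed(64, 128))   # 1  (0.5 rounds up, never down to 0)
print(nodes_needed(256, 128))  # 2
print(nodes_needed(257, 128))  # 3
```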
However, if you want a float, you can set format=float, which will enforce that the value is printed as a float.
[11]:
template = """
time = #walltime:format=time#
num_flt = #num_flt:format=float#
num_int = #num_int#
"""
conn = Computer(template=template)
conn.walltime = "24h"
conn.num_flt = 3.0
conn.num_int = 3.0
print(conn.script())
time = 24:00:00
num_flt = 3.0
num_int = 3
static¶
Sometimes you need to have {} in your output, but you may have found that this causes the contents to be evaluated. If that’s the case, you can force that value to be “static” by passing static=True.
[12]:
template = """
a = #a:default=10#
b = #b:default={a}#
c = #c:default={a}:static=True#
"""
print(Computer(template=template).script())
a = 10
b = 10
c = {a}
empty_treatment¶
This argument dictates how empty values are treated. It can be applied individually to the parameters, or globally to the Computer.
Since the default behaviour is to remove lines with empty values, you do not need to worry about over-parameterising a jobscript. Provided you keep in mind the behaviours described below, a template can contain several times more content than any individual jobscript it produces.
The possible values are:
line
ignore
local
We can demonstrate this with a parameter #NODES# and see what happens if we leave it without a value.
line¶
This is the default behaviour, and removes the whole line.
[13]:
template = "#SBATCH nodes = #NODES:empty_treatment=line#"
print(Computer(template=template).script())
local¶
Local removes only the placeholder itself, leaving the rest of the line.
Useful when you have multiple parameters on the same line.
[14]:
template = "#SBATCH nodes = #NODES:empty_treatment=local#"
print(Computer(template=template).script())
#SBATCH nodes =
ignore¶
This behaviour intentionally does nothing, leaving everything untouched:
[15]:
template = "#SBATCH nodes = #NODES:empty_treatment=ignore#"
print(Computer(template=template).script())
#SBATCH nodes = #NODES:empty_treatment=ignore#
Here, we can see all the behaviours in one script:
[16]:
template = """
line = #line:empty_treatment=line#
local = #local:empty_treatment=local#
ignore = #ignore:empty_treatment=ignore#
"""
print(Computer(template=template).script())
local =
ignore = #ignore:empty_treatment=ignore#
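The three treatments can be mimicked with a short standalone function. This is a simplified sketch: render_empty and its regular expression are assumptions for illustration, it applies one treatment to the whole template, and it assumes no placeholder has a value.

```python
import re

# Hypothetical placeholder pattern: '#name#' or '#name:key=value...#'.
PLACEHOLDER = re.compile(r"#(\w+)(?::[^#\n]*)?#")


def render_empty(template: str, treatment: str) -> str:
    """Render a template in which every placeholder is empty."""
    if treatment not in ("line", "local", "ignore"):
        raise ValueError(f"unknown treatment: {treatment}")
    output = []
    for line in template.splitlines():
        if treatment == "ignore" or PLACEHOLDER.search(line) is None:
            output.append(line)  # leave the line untouched
        elif treatment == "local":
            output.append(PLACEHOLDER.sub("", line))  # drop only the placeholder
        # treatment == "line": drop the whole line by appending nothing
    return "\n".join(output)


template = "#SBATCH nodes = #NODES#"
print(repr(render_empty(template, "line")))    # ''
print(repr(render_empty(template, "local")))   # '#SBATCH nodes = '
print(repr(render_empty(template, "ignore")))  # '#SBATCH nodes = #NODES#'
```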
Templating¶
Let’s apply the initial script to a Computer and put it to use.
Note
Again, we need to set the submitter parameter.
[17]:
from remotemanager import Dataset
conn = Computer(template=jobscript_template, submitter="sbatch")
Arguments¶
Checking the arguments property, we can see a list of everything that the connection is expecting to be parameterised.
This is similar to the list you get when querying required and missing, and also has aliases at args and subs.
[18]:
conn.arguments
[18]:
['tasks_per_node',
'cpus_per_task',
'nodes',
'queue',
'account',
'time',
'modules']
Let’s add this connection to a dataset and set the parameters that it’s expecting.
Note
Parameters can be set at the URL (Computer), Dataset, Runner, or run() level.
[19]:
modules = """
module load python
module load module/version
"""
def f(inp):
    return inp
ds = Dataset(f,
             url=conn,
             account="myuser",
             modules=modules,
             skip=False)
ds.append_run({"inp": True}, mpi_per_node=64, omp=4, nodes=4)
ds.run(dry_run=True)
appended run runner-0
Running Dataset
assessing run for runner dataset-9ebf1589-runner-0... running
launch command: cd temp_runner_remote && bash dataset-9ebf1589-master.sh
[20]:
print(ds.runners[0].jobscript.content)
#!/bin/bash
#SBATCH --nodes=4
#SBATCH --account=myuser
#SBATCH --walltime=01:00:00
#SBATCH --exclusive
module load python
module load module/version
export DIR_7f3744ae=7f3744ae_master
source $DIR_7f3744ae/dataset-9ebf1589-repo.sh
python dataset-9ebf1589-runner-0-run.py 2>> dataset-9ebf1589-runner-0-error.out
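And there we have a sensible jobscript. Note that the lines for parameters that received no value (tasks_per_node, cpus_per_task and queue) have been removed, as described under empty_treatment.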
run_args¶
It was noted earlier, but it’s worth covering in more detail. The actual arguments that are required are called “run args”, usually accessible at the run_args property where relevant.
You can set these at multiple different levels, with each successive one overriding the previous.
Set location | Note
Computer (URL) | Top level setting, global, but overridden by everything.
Dataset | Global level defaults, will override any Computer-level values.
Runner | Specific to that runner, will override any Dataset-level values.
run() | Global, but specific to that run. Overrides all parameters.
For more information on the specifics of run args and Datasets, see the Run Args Tutorial.
We can demonstrate this behaviour with a simple script:
[21]:
mini = Computer(template = "#TEST#")
mini.test = "conn level"
ds = Dataset(f, url=mini, skip=False, verbose=0)
ds.append_run({"inp": True})
ds.run(dry_run=True)
print(ds.runners[0].jobscript.content)
conn level
export DIR_3d536b34=3d536b34_master
source $DIR_3d536b34/dataset-9ebf1589-repo.sh
python dataset-9ebf1589-runner-0-run.py 2>> dataset-9ebf1589-runner-0-error.out
[22]:
ds.set_run_arg("test", "Dataset level")
ds.run(dry_run=True)
print(ds.runners[0].jobscript.content)
Dataset level
export DIR_3d536b34=3d536b34_master
source $DIR_3d536b34/dataset-9ebf1589-repo.sh
python dataset-9ebf1589-runner-0-run.py 2>> dataset-9ebf1589-runner-0-error.out
Here we see that the “Dataset level” setting takes priority.
Now let’s see what happens with multiple runners:
[23]:
ds.append_run({"inp": False}, test = "Runner level")
ds.run(dry_run=True)
print(ds.runners[0].jobscript.content)
Dataset level
export DIR_3d536b34=3d536b34_master
source $DIR_3d536b34/dataset-9ebf1589-repo.sh
python dataset-9ebf1589-runner-0-run.py 2>> dataset-9ebf1589-runner-0-error.out
[24]:
print(ds.runners[1].jobscript.content)
Runner level
export DIR_a61456f2=a61456f2_master
source $DIR_a61456f2/dataset-9ebf1589-repo.sh
python dataset-9ebf1589-runner-1-run.py 2>> dataset-9ebf1589-runner-1-error.out
Adding a second runner and setting a value for test there updates it in place, for that Runner only.
However, setting it at the run() level will override even this:
[25]:
ds.run(dry_run=True, test="run() level")
print(ds.runners[1].jobscript.content)
run() level
export DIR_a61456f2=a61456f2_master
source $DIR_a61456f2/dataset-9ebf1589-repo.sh
python dataset-9ebf1589-runner-1-run.py 2>> dataset-9ebf1589-runner-1-error.out
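One way to picture this layering is as a chain of dictionaries searched from the most specific level to the least specific. The sketch below uses collections.ChainMap from the standard library; the dictionary names are hypothetical and this is not how remotemanager stores run args internally.

```python
from collections import ChainMap

# One dict per level; the values echo the demonstration above.
computer_args = {"test": "conn level", "nodes": 1}
dataset_args = {"test": "Dataset level"}
runner_args = {"test": "Runner level"}
run_call_args = {}  # nothing passed to run() yet

# Lookups search left to right, so the most specific level goes first.
effective = ChainMap(run_call_args, runner_args, dataset_args, computer_args)
print(effective["test"])   # Runner level
print(effective["nodes"])  # 1, only defined at the Computer level

run_call_args["test"] = "run() level"
print(effective["test"])   # run() level
```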
Duplicate Arguments¶
Template extraction uses the first occurrence of an argument within a script, so any kwargs should be added there.
Added in version 0.11.15: BaseComputer will raise an exception if kwargs are detected on any occurrence of an argument other than the first.
[26]:
template = """
#test:default=True#
#test:default="foo"#
"""
temp = Computer(template=template)
print(temp.test.value) # note how the value is "True", not "foo"
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[26], line 7
1 template = """
2 #test:default=True#
3
4 #test:default="foo"#
5 """
----> 7 temp = Computer(template=template)
9 print(temp.test.value) # note how the value is "True", not "foo"
File ~/Work/Devel/remotemanager/remotemanager/connection/computers/computer.py:38, in Computer.__init__(self, template, **kwargs)
34 def __init__(self, template, **kwargs):
35 # super() behaves strangely with multiple inheritance
36 # explicitly call the __init__ with self
37 URL.__init__(self, **kwargs)
---> 38 Script.__init__(self, template=template, **kwargs)
File ~/Work/Devel/remotemanager/remotemanager/script/script.py:80, in Script.__init__(self, template, empty_treatment, **init_args)
77 self._empty_treatment = None
78 self.empty_treatment = empty_treatment
---> 80 self._extract_subs()
File ~/Work/Devel/remotemanager/remotemanager/script/script.py:148, in Script._extract_subs(self)
146 if name in self._subs:
147 if kwargs is not None and kwargs != "":
--> 148 raise ValueError(
149 f"Got more kwargs for already registered argument "
150 f"{name}: {kwargs}"
151 )
152 logger.debug("\talready processed, continuing")
153 continue
ValueError: Got more kwargs for already registered argument test: default="foo"
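The rule can be summarised as "the first occurrence registers the argument, later occurrences may only repeat the bare name". The register function below is a hypothetical sketch of that rule, with an error message modelled on the traceback above.

```python
def register(bodies: list[str]) -> dict[str, str]:
    """Register placeholder bodies, rejecting kwargs on repeated names."""
    registered = {}
    for body in bodies:
        name, _, kwargs = body.partition(":")
        name = name.lower()
        if name in registered:
            if kwargs:
                raise ValueError(
                    f"Got more kwargs for already registered argument {name}: {kwargs}"
                )
            continue  # a bare repeat is fine
        registered[name] = kwargs
    return registered


print(register(["test:default=True", "test"]))  # {'test': 'default=True'}

try:
    register(["test:default=True", 'test:default="foo"'])
except ValueError as err:
    print(err)
```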
Escape Sequences¶
Added in version 0.13.5.
Templates reserve a small selection of characters as “control characters”. An example of this is splitting arguments using the : delimiter.
But what if we need to add one of these characters to our template? There are two ways to indicate that control characters should be added literally.
Quotation¶
A simple method is to quote the value. Strings inside quotation marks will not be parsed. Note that the quotation marks themselves are kept in the output.
[27]:
template = """
ratio: #ratio:default="a:b"#
equal: #equal:default="a=b"#
"""
test = Computer(template=template)
print(test.script())
ratio: "a:b"
equal: "a=b"
Escaping with \¶
The backslash (\) character is a standard escape character, and this functionality has been extended to templates.
There are two methods of adding escape sequences to templates:
Specify the template as a “raw” string: r"foo\:bar"
“Double-escape” the sequence: "foo\\:bar"
Let’s repeat the previous example, without quotes:
[28]:
template = r"""
ratio: #ratio:default=a\:b#
equal: #equal:default=a\=b#
"""
test = Computer(template=template)
print(test.script())
ratio: a:b
equal: a=b
[29]:
template = """
ratio: #ratio:default=a\\:b#
equal: #equal:default=a\\=b#
"""
test = Computer(template=template)
print(test.script())
ratio: a:b
equal: a=b
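The two spellings produce the same Python string, which is why both templates render identically. The sketch below confirms this and shows one possible way to split on unescaped colons only; split_unescaped is a hypothetical helper, not part of remotemanager.

```python
import re

raw = r"ratio:default=a\:b"       # raw string: the backslash is kept as typed
doubled = "ratio:default=a\\:b"   # normal string: '\\' is one backslash
print(raw == doubled)  # True


def split_unescaped(body: str) -> list[str]:
    """Split on ':' unless it is preceded by a backslash, then drop the escapes."""
    parts = re.split(r"(?<!\\):", body)
    return [re.sub(r"\\([:=])", r"\1", part) for part in parts]


print(split_unescaped(raw))                    # ['ratio', 'default=a:b']
print(split_unescaped(r"equal:default=a\=b"))  # ['equal', 'default=a=b']
```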
Further control¶
We have seen hints in this tutorial that values can be linked to one another. The next tutorial will cover this in greater detail, allowing you to get the full potential out of templating.